LOCALITY AND LOOP SCHEDULING ON NUMA MULTIPROCESSORS

Authors

  • Hui Li
  • Sudarsan Tandri
  • Michael Stumm
  • Kenneth C. Sevcik
Abstract

An important issue in the parallel execution of loops is how to partition and schedule the loops onto the available processors. While most existing dynamic scheduling algorithms manage load imbalances well, they fail to take locality into account and therefore perform poorly on parallel systems with non-uniform memory access times. In this paper, we propose a new loop scheduling algorithm, Locality-based Dynamic Scheduling (LDS), that exploits locality and dynamically balances the load.
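To make the trade-off described in the abstract concrete, here is a minimal sketch of a locality-first dynamic scheduler, written as a small deterministic simulation. It is not the LDS algorithm from the paper, whose details are not given on this page. The function name `schedule`, the fixed chunk size, the steal-from-the-longest-queue rule, and the per-iteration cost model are all assumptions chosen for illustration.

```python
from collections import deque


def schedule(num_iters, num_procs, chunk, cost):
    """Simulate a locality-first dynamic loop scheduler (illustrative only).

    Each processor owns a contiguous "home" block of iterations, standing in
    for data that lives in its local memory. A processor takes chunks from
    its own block first and takes remote work only when its block is empty.
    Assumes chunk >= 1; cost(i) is a hypothetical execution time of iteration i.
    """
    block = -(-num_iters // num_procs)  # ceiling division
    queues = [deque(range(p * block, min((p + 1) * block, num_iters)))
              for p in range(num_procs)]
    clock = [0.0] * num_procs            # simulated time at which each processor is free
    executed = [[] for _ in range(num_procs)]
    remote = 0                           # iterations run away from their home processor

    while any(queues):
        # The processor that becomes idle first asks for work next.
        p = min(range(num_procs), key=clock.__getitem__)
        local = bool(queues[p])
        # Prefer local work; otherwise take from the most loaded processor.
        src = p if local else max(range(num_procs), key=lambda q: len(queues[q]))
        take = min(chunk, len(queues[src]))
        for _ in range(take):
            # Local work comes from the front, remote work from the tail.
            i = queues[src].popleft() if local else queues[src].pop()
            executed[p].append(i)
            clock[p] += cost(i)
        if not local:
            remote += take

    return executed, remote, max(clock)


if __name__ == "__main__":
    # Uniform costs: every iteration stays on its home processor.
    print(schedule(16, 4, 2, lambda i: 1.0)[1:])
    # Skewed costs: an idle processor takes work from the overloaded one.
    print(schedule(16, 4, 2, lambda i: 10.0 if i < 4 else 1.0)[1:])
```

In this sketch, locality is preserved for as long as the load is balanced, and remote iterations appear only once a processor has exhausted its own block. A real NUMA scheduler would also have to account for synchronisation cost on the shared queues, which the simulation ignores.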

Similar resources

Scheduling of Wavefront Parallelism on Scalable Shared-memory Multiprocessors

Tiling exploits temporal reuse carried by an outer loop of a loop nest to enhance cache locality. Loop skewing is typically required to make tiling legal. This restricts parallelism to wavefronts in the tiled iteration space. For a small number of processors, wavefront parallelism can be efficiently exploited using dynamic self-scheduling with a large tile size. Such a strategy enhances intratil...

Feedback Guided Dynamic Loop Scheduling: Algorithms and Experiments

Dynamic loop scheduling algorithms can suffer from overheads due to synchronisation, loss of locality and small iteration counts. We observe that timing information from previous executions of the loop can be utilised to reduce these overheads. We introduce two new algorithms for dynamic loop scheduling which implement this type of feedback guidance, and report experimental results on a distribu...

Program Transformations for Cache Locality Enhancement on Shared-memory Multiprocessors

Program Transformations for Cache Locality Enhancement on Shared-memory Multiprocessors. Naraig Manjikian, Doctor of Philosophy, Graduate Department of Electrical and Computer Engineering, University of Toronto, 1997. This dissertation proposes and evaluates compiler techniques that enhance cache locality and consequently improve the performance of parallel applications on shared-memory multiprocesso...

An Analytical Model-Based Auto-tuning Framework for Locality-Aware Loop Scheduling

HPC developers aim to deliver the very best performance. To do so they constantly think about memory bandwidth, memory hierarchy, locality, floating point performance, power/energy constraints and so on. On the other hand, application scientists aim to write performance portable code while exploiting the rich feature set of the hardware. By providing adequate hints to the compilers in the form ...

Extending Pluto-Style Polyhedral Scheduling with Consecutivity

The Pluto scheduler is a successful polyhedral scheduler that is used in one form or another in several research and production compilers. The core scheduler is focused on parallelism and temporal locality and does not directly target spatial locality. Such spatial locality is known to bring performance benefits and has been considered in various forms outside and inside polyhedral compilation....

Publication date: 1993